Context Connectors #53

igor0 · 2025-12-15T20:53:51Z

An open-source library built on the Context Engine SDK that makes diverse sources searchable across agents and apps

Context Connectors enables users to:

Build indexes from Git repos (GitHub, GitLab, BitBucket), documentation websites, or local filesystem: index code, documentation, runbooks, schemas, configs, and more. Use DirectContext in the Context Engine SDK for custom sources.
Store indexes on a local filesystem for fast & simple access, or in S3 for persistent storage in production apps.
Search indexes via interactive agent, MCP for AI integrations, CLI for quick searches, or DirectContext in the Context Engine SDK for custom implementations.

Switch from openai() to openai.chat() to use the Chat Completions API instead of the Responses API. The Responses API is stateful and generates server-side IDs (fc_...) for function calls that are not persisted for Zero Data Retention (ZDR) organizations, causing multi-step tool calls to fail. The Chat Completions API is stateless and works correctly with ZDR.

- OpenAI: gpt-5.2 → gpt-5-mini
- Anthropic: claude-sonnet-4-5 → claude-haiku-4-5
- Google: gemini-3-pro → gemini-3-flash-preview

Also adds Phase 10 test results documenting:
- ZDR compatibility fix (openai.chat vs openai)
- Model availability testing
- Multi-provider verification

- Add ./clients export path to package.json for programmatic API access
- Export createMCPServer, runMCPServer, MCPServerConfig from clients module
- Document Phase 11 programmatic API test results in test-results.md

Flip the default behavior: file tools (listFiles, readFile) are now enabled by default. Use --search-only to disable them. This is more intuitive - users get full functionality by default and explicitly opt out when they only want the search tool.
- cmd-mcp: --search-only disables list_files/read_file tools
- cmd-agent: --search-only disables listFiles/readFile tools
- cmd-search: --search-only disables file access

…ansport
- Add mcp-http-server.ts with runMCPHttpServer() and createMCPHttpServer()
- Add mcp-serve CLI command with --port, --host, --cors, --base-path, --api-key options
- Support API key authentication via Authorization: Bearer header
- Support CORS for browser-based clients
- Update README with HTTP server documentation and examples

Updated tool descriptions for search, list_files, and read_file to be more detailed and informative, adapting from Auggie CLI while keeping content appropriate for context-connectors:
- Added multi-line descriptions with features and usage notes
- Included condensed regex syntax guide for searchPattern
- Clarified parameter semantics (1-based, inclusive, relative paths)
- Removed coding-specific language to support general use cases

Files modified:
- src/clients/mcp-server.ts
- src/clients/cli-agent.ts

…-specific store paths
- Rename -k, --key flag to -n, --name across all CLI commands
- Change default store location from CWD-relative .context-connectors to:
  - Linux: ~/.local/share/context-connectors (XDG Base Directory spec)
  - macOS: ~/Library/Application Support/context-connectors
  - Windows: %LOCALAPPDATA%\context-connectors
- Add CONTEXT_CONNECTORS_STORE_PATH environment variable override
- Priority order: --store-path CLI option > env var > platform default
- Update README.md with new flag, Data Storage section, and env var docs
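The priority order described in this commit can be sketched as a small pure function. This is only an illustration: `resolveStorePath`, `StorePathInputs`, and `SupportedPlatform` are hypothetical names, not the library's API, and the Windows fallback when `LOCALAPPDATA` is unset is an assumption.

```typescript
// Hypothetical helper illustrating: --store-path > env var > platform default.
type SupportedPlatform = "linux" | "darwin" | "win32";

interface StorePathInputs {
  cliStorePath?: string; // value of --store-path, if the user passed it
  env: Record<string, string | undefined>;
  platform: SupportedPlatform;
  homeDir: string;
}

function resolveStorePath(inputs: StorePathInputs): string {
  const { cliStorePath, env, platform, homeDir } = inputs;

  // 1. An explicit CLI option always wins.
  if (cliStorePath) return cliStorePath;

  // 2. Otherwise fall back to the environment variable override.
  const fromEnv = env["CONTEXT_CONNECTORS_STORE_PATH"];
  if (fromEnv) return fromEnv;

  // 3. Otherwise use the platform default listed in the commit message.
  switch (platform) {
    case "darwin":
      return `${homeDir}/Library/Application Support/context-connectors`;
    case "win32": {
      // Assumption: fall back to the conventional AppData\Local location.
      const localAppData = env["LOCALAPPDATA"] ?? `${homeDir}\\AppData\\Local`;
      return `${localAppData}\\context-connectors`;
    }
    default:
      return `${homeDir}/.local/share/context-connectors`;
  }
}
```

Passing the environment and platform in as arguments, rather than reading `process.env` directly, keeps the precedence logic easy to test on any OS.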

Replace importFromFile with temp file pattern with direct import() call. This is platform-neutral and avoids unnecessary filesystem operations.

Replace flat --source flag with subcommands for each source type:
- index filesystem (alias: fs)
- index github
- index gitlab
- index bitbucket
- index website

Each subcommand now shows only relevant options in --help. Extracted shared store options into reusable helper functions.

- Make --name optional for mcp and mcp-serve commands (default: all indexes)
- Accept multiple index names with -n/--name <names...>
- Discover available indexes at startup from store
- Include available indexes in tool descriptions
- Add index_name parameter to all tools (search, list_files, read_file)
- Lazy initialization of SearchClient per index on first use
- Cache initialized clients for reuse

Show source type, identifier, and relative sync time for each index:

  NAME          SOURCE                      SYNCED
  augment-docs  github://augmentcode/docs   1d ago
  lm-plot       github://igor0/lm-plot      1d ago

- Change SourceMetadata to discriminated union with typed config per source
- Store original ref (branch/tag) separately from resolvedRef (SHA)
- Enable future re-indexing by preserving all source parameters
- Add getSourceIdentifier() and getResolvedRef() helper functions
- Maintain backward compatibility with legacy format indexes
- Update all sources, consumers, and tests

- sync <name> updates a single index using stored config
- sync --all updates all indexes
- Keeps 'index' command name (clearer than 'add')

- Add explicit path format examples with ✅/❌ to prevent /repo confusion
- Add example output for search results
- Use consistent 'repository root' terminology
- Add clearer parameter documentation with inline defaults
- Improve regex guidance with specific unsupported pattern examples

- Use loadState() instead of load() for tests that need full state
- Use loadSearch() for tests that need search-only state

- Change CLI option from -n, --name to -i, --index
- Named indexes now stored in {basePath}/indexes/{key}/ instead of {basePath}/{key}/
- File-based mode (no --index) still stores directly in basePath
- Update list() to look in indexes/ subdirectory

- Add -i/--index option as alias for -n/--name in both commands
- When no --index is provided, load directly from {store-path}/search.json
- Use '.' as key for file-based mode (files go directly in store-path)
- Update error messages to show correct location when index not found
- Fix filesystem.test.ts to use indexes/ subdirectory structure

- Move 'list' and 'delete' under 'index' command (index list, index delete)
- Restructure MCP commands: 'mcp' -> 'mcp local', 'mcp-serve' -> 'mcp remote'
- Remove obsolete 'sync' command (replaced by 'index github --index <n>')
- Consolidate cmd-list.ts, cmd-delete.ts, cmd-mcp-serve.ts into their parent commands
- Update main entry point to reflect new command structure

- Updated Indexer to call context.export() with 'full' and 'search-only' modes
- Updated IndexStore.save() to accept both fullState and searchState
- Updated FilesystemStore, MemoryStore, S3Store to save both state files
- SearchClient now uses loadSearch() for search-only state
- Removed manual blob stripping - SDK handles export modes natively
- Updated tests to use new dual-state pattern

- Add 'local' command with 'list' and 'delete' subcommands for managing local indexes
- Add s3-config.ts helper to read S3 config from CC_S3_* environment variables

…deduplication
- Add --save-content debug option to save crawled content for inspection
- Add progress reporting during upload and indexing phases
- Show clear summary of new vs unchanged files
- Reuse previous context state for client-side deduplication
- Cache crawl results to avoid redundant re-crawling

…rogress

Update to use new SDK progress API that provides separate uploaded and indexed counters instead of a single stage-dependent processed counter.

- Add unified --index <specs...> option for mcp local, mcp remote, and agent
- Support index spec formats: name, path:/path, s3://bucket/key
- Add CompositeStoreReader for routing to different stores based on specs
- Add MultiIndexRunner for shared multi-index client management
- Extract tool descriptions to shared tool-descriptions.ts module
- Agent command now supports multiple indexes like MCP commands
- All tools include index_name parameter when multiple indexes specified
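The three index spec formats listed above (name, path:/path, s3://bucket/key) could be parsed along these lines. This is a minimal sketch: `IndexSpec` and `parseIndexSpec` are hypothetical names chosen for illustration, and the validation rules are assumptions rather than the behavior of the actual CLI.

```typescript
// Hypothetical parser for the three spec formats named in the commit message.
type IndexSpec =
  | { kind: "name"; name: string }
  | { kind: "path"; path: string }
  | { kind: "s3"; bucket: string; key: string };

function parseIndexSpec(spec: string): IndexSpec {
  if (spec.startsWith("s3://")) {
    // s3://bucket/key: everything after the first slash is the object key.
    const rest = spec.slice("s3://".length);
    const slash = rest.indexOf("/");
    if (slash <= 0 || slash === rest.length - 1) {
      throw new Error(`Invalid S3 index spec: ${spec}`);
    }
    return { kind: "s3", bucket: rest.slice(0, slash), key: rest.slice(slash + 1) };
  }
  if (spec.startsWith("path:")) {
    // path:/some/dir: an index stored at a custom filesystem location.
    const path = spec.slice("path:".length);
    if (path.length === 0) throw new Error(`Invalid path index spec: ${spec}`);
    return { kind: "path", path };
  }
  if (spec.length === 0) throw new Error("Empty index spec");
  // Anything else is treated as a named index in the default store.
  return { kind: "name", name: spec };
}
```

A discriminated union like this lets a router such as the CompositeStoreReader mentioned above switch on `kind` to pick the right store.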

- Use -i, --index <spec> with same format as other commands
- Remove deprecated -n/--name and --store options
- Remove --path override (source path comes from index metadata)

Named indexes now always use the default path (~/.augment/context-connectors). Users can use path:/custom/path for custom locations.

- Remove --search-only (not meaningful for search command)
- Add --raw flag for raw search results
- Default behavior uses searchAndAsk() to answer questions via LLM
- Add searchAndAsk() method to SearchClient

- Add runtime console warnings when binding to non-localhost interfaces
- Warn that HTTP traffic is unencrypted and API keys transmitted in cleartext
- Suggest production alternatives: TLS reverse proxy, VPN, SSH tunneling
- Add Security Considerations section to README with:
  - Caddy and nginx reverse proxy examples
  - SSH tunneling instructions
  - Network isolation guidance
  - API key generation recommendations

Align CLI naming with MCP SDK transport terminology:
- 'mcp local' is now 'mcp stdio'
- 'mcp remote' is now 'mcp http'

Updated all README examples to use new command names.

- Add header: 'Context Connectors Minimal Agent'
- Add two newlines after each agent turn for better readability
- Only show tool calls in verbose mode

- Run interactively by default, non-interactively with --print flag
- Query is now an optional positional argument (works in both modes)
- In interactive mode with query: asks query first, then prompts
- In --print mode without query: exits with error

augment-app-staging

Review completed. 4 suggestions posted.

Comment augment review to trigger a new review at any time.

augment-app-staging · 2026-01-08T22:24:22Z

context-connectors/src/stores/types.ts

+   * @param key - The index key/name
+   * @returns The stored IndexState (without blobs), or null if not found
+   */
+  loadSearch(key: string): Promise<IndexState | null>;


loadSearch() is documented as returning the search-optimized state without blobs, but the type is Promise<IndexState | null>; this makes it easy to accidentally treat a search-only state as a full IndexState (and e.g. assume contextState.blobs exists). Using IndexStateSearchOnly here would better reflect the actual contract.

Other Locations

context-connectors/src/stores/filesystem.ts:81

context-connectors/src/stores/memory.ts:40

context-connectors/src/stores/s3.ts:141

context-connectors/src/clients/search-client.ts:20

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augment-app-staging · 2026-01-08T22:24:22Z

context-connectors/src/stores/s3.ts

+    ]);
+  }
+
+  async delete(key: string): Promise<void> {


delete() removes only state.json; search.json is left behind, so list() can continue to return this key and search clients may read stale state. This can make deletions appear to “not work” in S3-backed setups.
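One way to address this is to derive every object key that belongs to an index from a single list and delete them together. The sketch below is an assumption-heavy illustration: `ObjectDeleter`, `stateObjectKeys`, and `deleteIndex` are hypothetical names, and the `{prefix}{key}/{file}` layout is assumed, not taken from s3.ts.

```typescript
// Hypothetical minimal store interface; a real S3 client would sit behind it.
interface ObjectDeleter {
  deleteObject(key: string): Promise<void>;
}

// Both state files named in the review comment.
const STATE_FILES = ["state.json", "search.json"] as const;

// Assumed key layout: {prefix}{indexKey}/{file}.
function stateObjectKeys(prefix: string, indexKey: string): string[] {
  return STATE_FILES.map((file) => `${prefix}${indexKey}/${file}`);
}

// Delete every state file so list() and search clients no longer see the index.
async function deleteIndex(
  store: ObjectDeleter,
  prefix: string,
  indexKey: string
): Promise<void> {
  await Promise.all(
    stateObjectKeys(prefix, indexKey).map((key) => store.deleteObject(key))
  );
}
```

Keeping the file list in one constant means save(), delete(), and list() can share it, so a state file added later is less likely to be missed.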

augment-app-staging · 2026-01-08T22:24:22Z

context-connectors/src/sources/github.ts

+      });
+
+      stream.pipe(parser);
+      parser.on("close", resolve);


In downloadTarball(), the promise only rejects on stream errors; if the tar parser emits an error (corrupt/partial archive), this can hang indefinitely waiting for close. Consider handling parser errors as well so failures surface reliably.

Other Locations

context-connectors/src/sources/gitlab.ts:253
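The suggestion can be illustrated with a helper that settles the promise on whichever event arrives first, including parser errors. This is a sketch under assumptions: `ListenerTarget`, `settleOnPipelineEvents`, and `FakeEmitter` are hypothetical names, and the reduced `on()` interface stands in for the real stream and tar parser types.

```typescript
// Reduced event interface; stands in for a Node stream or tar parser here.
interface ListenerTarget {
  on(event: string, listener: (arg?: unknown) => void): unknown;
}

// Wire resolve/reject so that a parser error can no longer leave the promise pending.
function settleOnPipelineEvents(
  stream: ListenerTarget,
  parser: ListenerTarget,
  resolve: () => void,
  reject: (err: Error) => void
): void {
  let settled = false;
  const fail = (err?: unknown): void => {
    if (settled) return;
    settled = true;
    reject(err instanceof Error ? err : new Error(String(err)));
  };
  const done = (): void => {
    if (settled) return;
    settled = true;
    resolve();
  };
  stream.on("error", fail); // network or read failure
  parser.on("error", fail); // corrupt or partial archive
  parser.on("close", done); // archive fully processed
}

// Tiny in-memory emitter used only to demonstrate the wiring.
class FakeEmitter implements ListenerTarget {
  private listeners = new Map<string, Array<(arg?: unknown) => void>>();
  on(event: string, listener: (arg?: unknown) => void): this {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
    return this;
  }
  emit(event: string, arg?: unknown): void {
    for (const listener of this.listeners.get(event) ?? []) listener(arg);
  }
}
```

In a downloadTarball()-style function this would be called as `new Promise<void>((resolve, reject) => settleOnPipelineEvents(stream, parser, resolve, reject))` before piping the stream into the parser.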

augment-app-staging · 2026-01-08T22:24:22Z

context-connectors/src/sources/github.ts

+        removed.push(file.filename);
+      } else if (file.status === "added" || file.status === "modified" || file.status === "renamed") {
+        // Download file contents
+        const contents = await this.getFileContents(file.filename, currentRef);


fetchChanges() downloads and indexes changed files without applying the same .augmentignore/.gitignore + shouldFilterFile() checks used in full indexing, so incremental updates can index files that full indexing would intentionally skip (binary/oversized/ignored). This can lead to inconsistent indexes and occasional ingest errors.

Other Locations

context-connectors/src/sources/gitlab.ts:404

context-connectors/src/sources/bitbucket.ts:455

igor0 · 2026-01-09T06:36:33Z

augment review

augment-app-staging

Review completed. 7 suggestions posted.

Comment augment review to trigger a new review at any time.

augment-app-staging · 2026-01-09T06:45:16Z

context-connectors/src/sources/bitbucket.ts

+   */
+  private async readFileRawBuffer(path: string, ref: string): Promise<Buffer | null> {
+    try {
+      const url = `${this.baseUrl}/repositories/${this.workspace}/${this.repo}/src/${encodeURIComponent(ref)}/${encodeURIComponent(path)}`;


encodeURIComponent(path) will encode / as %2F, which likely breaks Bitbucket’s /src/{commit}/{path} endpoint for nested paths (files in subdirectories may become unreadable).
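A common way to avoid this is to encode each path segment separately and rejoin with literal slashes. The helper name `encodeRepoPath` below is hypothetical; the sketch only shows the encoding difference and does not claim to match how the fix was implemented in bitbucket.ts.

```typescript
// Encode each segment on its own so "/" separators survive in the URL path.
function encodeRepoPath(path: string): string {
  return path
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
}

// Whole-string encoding escapes the separators as well:
const wholeStringEncoded = encodeURIComponent("src/utils/my file.ts");
// → "src%2Futils%2Fmy%20file.ts"

// Per-segment encoding keeps the directory structure intact:
const perSegmentEncoded = encodeRepoPath("src/utils/my file.ts");
// → "src/utils/my%20file.ts"
```

Characters that do need escaping inside a segment (spaces, `#`, `?`) are still encoded, so nested paths stay addressable without breaking the endpoint's path structure.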

augment-app-staging · 2026-01-09T06:45:16Z

context-connectors/src/sources/bitbucket.ts

+
+    try {
+      // Clone with depth 1 for efficiency, then checkout the specific ref
+      execSync(`git clone --depth 1 --branch ${ref} "${cloneUrl}" "${tempDir}"`, {


ref here is a commit SHA (from resolveRefToSha()), but git clone --branch expects a branch/tag name; this clone step is likely to fail, and the follow-up git fetch origin ${ref} may also be rejected by the remote.
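A typical alternative when only a SHA is available is to initialise an empty repository, fetch that single commit, and check out FETCH_HEAD. The sketch below only builds the argument lists; `buildShaCheckoutCommands` is a hypothetical helper, and whether the remote permits fetching an arbitrary SHA depends on server configuration, which is not confirmed here.

```typescript
// Build git invocations for checking out one commit by SHA.
// Each entry is an argv array, suitable for an execFile-style call (no shell).
function buildShaCheckoutCommands(
  cloneUrl: string,
  sha: string,
  targetDir: string
): string[][] {
  // Reject anything that is not a hex object name, so a branch name or an
  // option-like string is never passed where a SHA is expected.
  if (!/^[0-9a-f]{7,64}$/i.test(sha)) {
    throw new Error(`Expected a commit SHA, got: ${sha}`);
  }
  return [
    ["git", "init", targetDir],
    ["git", "-C", targetDir, "remote", "add", "origin", cloneUrl],
    // Shallow-fetch just the requested commit; the server may refuse this.
    ["git", "-C", targetDir, "fetch", "--depth", "1", "origin", sha],
    ["git", "-C", targetDir, "checkout", "FETCH_HEAD"],
  ];
}
```

Passing argument arrays to something like `execFileSync` from `node:child_process`, instead of interpolating `ref` into a shell string, also avoids quoting problems with unusual ref or URL values. If the fetch by SHA is rejected, a fallback such as cloning the original branch and checking out the SHA would still be needed.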

augment-app-staging · 2026-01-09T06:45:16Z

context-connectors/src/sources/github.ts

+    });
+
+    // Download tarball
+    const response = await fetch(url);


This fetch(url) doesn’t include GitHub auth headers; for private repos the tarball endpoint/redirect is likely to 404/403, causing full indexing to fail.

augment-app-staging · 2026-01-09T06:45:16Z

context-connectors/src/clients/mcp-http-server.ts

+  const parseBody = (req: IncomingMessage): Promise<unknown> => {
+    return new Promise((resolve, reject) => {
+      let body = "";
+      req.on("data", (chunk) => (body += chunk));


parseBody buffers the entire request body with no size limit; if this server is exposed, a large POST could cause memory exhaustion/DoS.
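A size cap can be enforced while chunks arrive rather than after the whole body is buffered. The following is a minimal sketch: `BoundedBodyCollector` is a hypothetical class, not part of mcp-http-server.ts, and it works on raw bytes so the limit is independent of string encoding.

```typescript
// Accumulates request chunks and refuses to grow past a byte limit.
class BoundedBodyCollector {
  private chunks: Uint8Array[] = [];
  private received = 0;
  private overflowed = false;

  constructor(private readonly limitBytes: number) {}

  // Call from the request's "data" handler; throws once the limit is exceeded.
  push(chunk: Uint8Array): void {
    if (this.overflowed || this.received + chunk.byteLength > this.limitBytes) {
      // Drop what was buffered so an oversized request cannot pin memory.
      this.overflowed = true;
      this.chunks = [];
      this.received = 0;
      throw new Error(`Request body exceeds ${this.limitBytes} bytes`);
    }
    this.chunks.push(chunk);
    this.received += chunk.byteLength;
  }

  // Concatenate the buffered chunks; call from the "end" handler.
  bytes(): Uint8Array {
    const all = new Uint8Array(this.received);
    let offset = 0;
    for (const chunk of this.chunks) {
      all.set(chunk, offset);
      offset += chunk.byteLength;
    }
    return all;
  }
}
```

In an HTTP handler, the error from `push` would typically be turned into a 413 response followed by destroying the request, so the remaining upload is not read.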

augment-app-staging · 2026-01-09T06:45:16Z

context-connectors/src/tools/read-file.ts

+  if (output.length <= maxLength) {
+    return { text: output, truncated: false };
+  }
+  const truncateAt = maxLength - TRUNCATION_MESSAGE.length;


If maxOutputLength is smaller than TRUNCATION_MESSAGE.length, truncateAt becomes negative and slice(0, truncateAt) can produce unexpected output (not a clean truncation).
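The edge case can be handled by clamping the cut point at zero and falling back to a hard cut when the notice itself does not fit. This is a sketch only: the `TRUNCATION_MESSAGE` text and the `truncateOutput` signature below are placeholders, not the values used in read-file.ts.

```typescript
// Placeholder notice; the real message text in read-file.ts may differ.
const TRUNCATION_MESSAGE = "\n[output truncated]";

function truncateOutput(
  output: string,
  maxLength: number
): { text: string; truncated: boolean } {
  if (output.length <= maxLength) {
    return { text: output, truncated: false };
  }
  // Clamp so the slice index can never go negative.
  const truncateAt = Math.max(0, maxLength - TRUNCATION_MESSAGE.length);
  if (truncateAt === 0) {
    // No room for content plus notice: hard-cut to the limit instead.
    return { text: output.slice(0, Math.max(0, maxLength)), truncated: true };
  }
  return {
    text: output.slice(0, truncateAt) + TRUNCATION_MESSAGE,
    truncated: true,
  };
}
```

With a negative index, `slice(0, truncateAt)` counts from the end of the string, which is why the unclamped version can return more text than `maxLength` allows. In this sketch the returned text never exceeds `maxLength`.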

augment-app-staging · 2026-01-09T06:45:16Z

context-connectors/src/tools/read-file.ts

+  contextAfter: number
+): { lineNumbers: Set<number>; matchingLines: Set<number> } {
+  const flags = caseSensitive ? "g" : "gi";
+  const regex = new RegExp(pattern, flags);


new RegExp(pattern, flags) isn’t guarded; an invalid user-supplied regex pattern will throw and crash the tool call instead of returning a structured error.
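Wrapping the constructor in a try/catch and returning a result object is one way to surface a structured error. The names `CompileResult` and `compileSearchPattern` below are hypothetical, and the error wording is illustrative rather than what the tool actually returns.

```typescript
// Result type: either a compiled regex or a readable error for the caller.
type CompileResult =
  | { ok: true; regex: RegExp }
  | { ok: false; error: string };

function compileSearchPattern(
  pattern: string,
  caseSensitive: boolean
): CompileResult {
  const flags = caseSensitive ? "g" : "gi";
  try {
    return { ok: true, regex: new RegExp(pattern, flags) };
  } catch (err) {
    // new RegExp throws a SyntaxError for malformed patterns such as "(".
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Invalid regex pattern "${pattern}": ${reason}` };
  }
}
```

The tool handler can then check `ok` and return the error string as a normal tool response, so one bad pattern does not abort the whole call.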

augment-app-staging · 2026-01-09T06:45:16Z

context-connectors/src/clients/multi-index-runner.ts

+    // Load metadata for available indexes
+    const indexes: IndexInfo[] = [];
+    for (const name of indexNames) {
+      const state = await store.loadSearch(name);


If store.list() includes an index name but loadSearch(name) is missing (e.g., partial state), indexNames still includes it while indexes metadata omits it, so the server may advertise an index that later errors when used.

igor0 added 30 commits December 15, 2025 06:19

Context Connectors

96a5de5

.gitignore

50e9b3b

Remove AI SDK support

5d28d71

Test webhook

1490467

Fixes

5a6114e

Add ./clients export and MCP functions, document Phase 11 results

bf436a3

Remove test-results.md

7afb760

docs: add Phase 12 edge cases and error handling test results

8180d1b

Improvements

6185e23

Pagination fixes:

e057fb4

Fixes

f66fcc5

Remove unnecessary changes

3c1a44c

Tweak README.md

2c9abc0

Fix

4cc7f78

Use DirectContext.import() instead of temp file workaround

1109e41

Improvements

aebdfd3

More fine-grained indexing updates to stdout

aed8199

feat: enhance list command with source metadata

df99b2c

feat: add sync command for updating indexes

1aeb63a

Update comments

4546072

igor0 added 22 commits January 4, 2026 21:56

test(context-connectors): update tests for new store API

758e2a4

Remove init command (GitHub Actions support deferred)

118cf08

Add local command and S3 env var config

81a5a75

feat(context-connectors): use separate uploaded/indexed counters in p…

7c3980c

…rogress Update to use new SDK progress API that provides separate uploaded and indexed counters instead of a single stage-dependent processed counter.

Unify search CLI with index spec format

46f3308

Remove --store-path option from all commands

ab731e0

Refactor search command to use searchAndAsk by default

b213a7e

Rename mcp subcommands: local→stdio, remote→http

6c1d067

Improve agent CLI output formatting

96c58cc

Improve agent CLI output formatting

6ece8e2

Depend on published SDK

d9faccf

README update

33db628

Fix GitHub workflow

840e077

Address review feedback

c008b23

igor0 marked this pull request as ready for review January 8, 2026 21:25

augment-app-staging bot reviewed Jan 8, 2026

igor0 added 4 commits January 9, 2026 01:46

Address Auggie feedback

f4afb5e

Auggie feedback - tar library error handling

33510c6

Auggie review: better error handling

2040872

Auggie feedback: filter files on incremental updates

e066348

augment-app-staging bot reviewed Jan 9, 2026

2 participants
